Abstract

The current view of galaxy formation holds that all massive galaxies harbor a massive black hole at 
their center, but that these black holes are not always in an actively accreting phase. X-ray emission is 
often used to identify accreting sources, but for galaxies that are not harboring quasars (low-luminosity 
active galaxies), the X-ray flux may be weak, or obscured by dust. To aid in the understanding of 
weakly accreting black holes in the local universe, a large sample of galaxies with X-ray detections 
is needed. We cross-match the ROSAT All Sky Survey (RASS) with galaxies from the Sloan Digital 
Sky Survey Data Release 4 (SDSS DR4) to create such a sample. Because of the high SDSS source 
density and large RASS positional errors, the cross-matched catalog is highly contaminated by random 
associations. We investigate the overlap of these surveys and provide a statistical test of the validity of 
RASS-SDSS galaxy cross-matches. SDSS quasars provide a test of our cross-match validation scheme, 
as they have a very high fraction of true RASS matches. We find that the number of true matches 
between the SDSS main galaxy sample and the RASS is highly dependent on the optical spectral 
classification of the galaxy; essentially no star-forming galaxies are detected, while more than 0.6% 
of narrow-line Seyferts are detected in the RASS. Also, galaxies with ambiguous optical classification 
have a surprisingly high RASS detection fraction. This allows us to further constrain the SEDs of low- 
luminosity active galaxies. Our technique is quite general, and can be applied to any cross-matching 
between surveys with well-understood positional errors. 

Subject headings: galaxies: active — X-rays: general — X-rays: galaxies — quasars: general 



1. INTRODUCTION 

Distinguishing the processes that contribute to the 
emission from the centers of galaxies is vital to understanding
the co-evolution of galaxies and their central black holes. Among
nearby galaxies, a large fraction of central emission sources are
of ambiguous nature (Ho et al. 1997): emission lines in the optical
spectra of many galaxies seem to reflect a mix of behavior between
bona fide accretion (Seyfert-like) and active star formation
(H II-like). In order to discriminate between
the various possible ionization mechanisms and penetrate 
the obscuring dust layers that encircle these sources, we 
need observations at multiple wavelengths. In partic- 
ular, X-rays are less prone to dust absorption and thus 
can be used to distinguish between accretion sources and 
emission from young, hot stars. This can clarify the ob- 
served optical emission spectra and allow us to better 
describe the central accretion sources in low-luminosity 
active galactic nuclei (AGN). 

For an accurate census of the local galactic population, one must
study a statistically significant number of sources. The Sloan
Digital Sky Survey (SDSS, York et al. 2000) provides the largest
sample of galaxies with spectra which allow emission-line
classification of central sources. The ROSAT All-Sky Survey
(RASS, Voges et al. 1999, 2000) is the widest and deepest survey
of the X-ray sky. The SDSS and RASS are well matched in terms of
depth, but have quite different astrometry and spatial resolution.
Previous studies matching a variety of SDSS and RASS sources
include analyzing the X-ray properties of spectroscopically
confirmed quasars (Anderson et al. 2003, 2007), generating an
X-ray detected galaxy cluster catalog (Popesso et al. 2004),
searching for optically unidentified neutron stars (Agüeros et al.
2006), and surveying the multi-wavelength properties of SDSS
galaxies (Obrić et al. 2006).

Because broad-line quasars are expected to be strong X-ray
sources, one would expect a large number of matches between RASS
and SDSS for these objects. Anderson et al. (2003) characterized
the RASS properties of spectroscopically identified broad-line
quasars from the SDSS as well as some narrow-line sources. They
qualitatively discuss the likelihood that a given RASS-SDSS match
is a true match and include "normal" (non- or weakly emitting)
galaxies as a comparison of what a weak correlation would look
like. They study more than 1000 RASS-SDSS quasars/AGN and briefly
discuss a few properties of the sample. Their sample reproduces
the expected non-linear optical/X-ray (2500 Å/2 keV) relationship
among broad-line sources. The follow-up study, Anderson et al.
(2007), examines ~7000 sources with similar results.

A different investigation involves identifying RASS sources with
no obvious optical counterpart. For example, this is useful for
finding optically dim neutron stars. Agüeros et al. (2006)
identified all SDSS sources within 4 times the positional error of
each RASS source. They then removed from their catalog any RASS
source with an SDSS match which could have produced the X-ray
flux. After removing objects with NED identifications, visually
identified bad fields, and known galaxy clusters, 11 RASS sources
with no plausible SDSS optical counterpart remained. They claim
this number is consistent with the number of isolated neutron
stars expected in the SDSS field. Studying poorly understood
matching samples in this way can clarify whether the sample
includes primarily true matches or primarily false matches.

A recent comparison of RASS and SDSS in a multi-wavelength study
(Obrić et al. 2006) identified 267 RASS matches within 30" of SDSS
DR1 main sample galaxies (Strauss et al. 2002; Abazajian et al.
2003). They list a false association fraction of ~9% (computed
statistically based on the RASS source density) and also show the
positions of their galaxies on an optical emission-line
classification diagram (the BPT diagram; Baldwin et al. 1981).
They did not investigate known-bad matches (as in Agüeros et al.
2006), nor did they elaborate on the positions of the RASS
detected galaxies on their BPT diagram.

The ROSAT All Sky Survey was produced from data 
acquired in ROSAT's scanning mode, but ROSAT also 
performed many individual targeted observations, result- 
ing in several pointed catalogs. These catalogs were gen- 
erated from serendipitous source discoveries made during 
individual targeted observations. Because of this, they 
contain a large number of sources in very small fields 
scattered over the sky with highly varying exposure du- 
rations, making source upper limits difficult to compute. 
Previous studies (e.g., Suchkov et al. 2006) have examined the
properties of SDSS quasars found in these catalogs.

Schulte-Ladbeck et al. (2005) looked at star forming
galaxies in the SDSS DR1 and matched them to several 
different ROSAT catalogs, including the RASS. Their fi- 
nal results involve 14 star forming galaxies which they 
claim to be X-ray sources (four of which were previously 
studied). We were not able to determine exactly which 
catalog they used in their published results. There- 
fore, we cannot check whether the results represent true 
matches between RASS and SDSS. Some star-forming 
galaxies are expected to be X-ray emitters, but whether 
these galaxies are actually detected in RASS remains to 
be seen. 

The XMM-Newton and Chandra X-ray satellites both 
provide much improved pointing, resolution and depth 
over ROSAT, but their fields of view are quite small. 
Both have produced serendipitous source catalogs similar 
to the ROSAT pointed catalogs mentioned above. The 
initial XMM serendipitous source catalog was compared
with the USNO A2.0 optical catalog (Georgakakis et al.
2006) to find 46 optically identified non-AGN galaxies
with substantial X-ray flux. Hornschemeier et al. (2005)
matched serendipitous source detections in Chandra with
SDSS DR2 (Abazajian et al. 2004) to find 42 X-ray emitting
galaxies of a variety of types. The XMM-slew survey
(Freyberg et al. 2005) aims to solve the field of view
and uniformity problems by taking data during space- 
craft slews between targets. It will produce an all-sky 
map of equivalent depth to RASS, with more than six 
times better resolution and pointing accuracy, in roughly 
6 years. 

In this paper, we investigate the accuracy of matching RASS
sources with SDSS galaxies. In Section 2 we describe the data sets
used in this study, including the systematics of selecting an
appropriate galaxy sample from SDSS. The details of the
cross-matching procedure and the statistical methods are described
in Section 3, and the final matched data sets, separated by galaxy
spectroscopic class, are detailed in Section 4. We find that a
RASS/SDSS galaxy match cannot be trusted to represent the galaxy's
true X-ray flux without first identifying the galaxy's spectral
type. Section 5 provides a preliminary analysis of the new
XMM-slew catalog and shows its utility in clarifying the presence
of X-ray sources in galaxies.

2. DATA 
2.1. SDSS 

This study employs data from the Sloan Digital Sky Survey Data
Release 4 (SDSS DR4), an optical imaging and spectroscopic survey
with spectroscopic coverage of ~16% of the sky, as described in
York et al. (2000) and Adelman-McCarthy et al. (2006). Technical
details of the photometric camera, telescope, analysis pipeline,
monitor and related systems can be found in Gunn et al. (1998),
Gunn et al. (2006), Lupton et al. (2001), Hogg et al. (2001) and
Smith et al. (2002), while Pier et al. (2003) describe the
astrometric calibration and Fukugita et al. (1996) describe the
u'g'r'i'z' photometric system. The SDSS has very good photometric
resolution (~1".4 PSF) and astrometric precision (< 0".1 rms per
coordinate) (Stoughton et al. 2002). The spectroscopic survey is
constructed from tilings of the photometric data (Blanton et al.
2003) and includes the main galaxy sample, quasar sample and
luminous red galaxy sample, which are described in Strauss et al.
(2002), Richards et al. (2002), and Eisenstein et al. (2001),
respectively. This spectroscopic survey includes uniform, high
quality spectra of more than half a million galaxies and nearly
100,000 quasars, via plates containing 3" diameter optical fibers.

2.1.1. SDSS Galaxies 

Our focus is on the main galaxy spectroscopic sample 
which includes all galaxies with Petrosian r magnitudes 
brighter than 17.77 with the exception of those not ob- 
served due to fiber collision. Because of the size of the 
fiber-plugs, spectroscopic targets for a single plate must 
be separated by at least 55". This was more of a problem 
in DR1 and DR2, before overlapping plates and follow- 
up observations filled in many of the missing objects. 
The complete SDSS spectroscopic catalog includes more 
galaxies with spectra than just the main galaxy sample. 
We restrict ourselves to the main galaxy sample to avoid 
sample bias; some SDSS objects were selected for spectroscopy
due to their proximity to FIRST radio sources
(Becker et al. 1995) and/or RASS X-ray sources. See the
appendix for details on our SDSS source selection pro- 
cess, and the importance of using the main galaxy sample 
in cross-matching studies. 

SDSS studies at MPA/JHU produced a catalog^1 of secondary source
products generated from the SDSS spectroscopic data (see
Brinchmann et al. 2004 for the DR2 catalog paper). This catalog
includes simultaneous measurements of the emission and absorption
line profiles. The complete catalog includes all objects in SDSS
(regardless of magnitude) that are spectroscopically identified as
galaxies; sources with emission-line widths greater than
1000 km/s are not included (thus all objects identified as
Seyferts and LINERs in this paper are type 2 objects). We restrict
ourselves to the intersection of the MPA/JHU catalog and the SDSS
DR4 main galaxy sample described above.

1 Data catalogues from SDSS studies at MPA/JHU:
http://www.mpa-garching.mpg.de/SDSS/

Fig. 1. Emission-line galaxy classification diagram used to
separate H IIs, Transitions, LINERs and Seyferts. "Unclassified
emission" galaxies are those which lie in a different region in
each diagram.

2.1.2. Spectral Classification of Main Sample Galaxies 

We classify galaxies based on their optical emission-line
properties. Galaxies showing at least a 2σ detection of flux in
the emission features Hα, Hβ, [O III], [N II], [S II] and [O I]
are classified as emission-line galaxies, while those that show
some but not all of these lines are called "unclassifiable"
galaxies. The strong line emitters are further separated into
sources dominated by accretion and those dominated by light from
hot, young stars. We classify H IIs, Seyferts, LINERs and
Transition objects based on their positions in a 4-dimensional
space defined by the line-flux ratios [O III]λ5007/Hβ,
[N II]λ6583/Hα, [S II]λλ6716,6731/Hα, and [O I]λ6300/Hα. We use
the classification criteria from Kewley et al. (2006). Fig. 1
shows the regions defining each galaxy subclass. We call those
galaxies that do not lie in the same classification region in each
diagram "unclassified emission" galaxies. Finally, galaxies
showing no signs of emission in Hα, Hβ and [O III] are classified
as "Passive" galaxies. More details on this classification scheme
can be found in Constantin et al. (2007).
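
The first, coarse stage of this scheme (emission-line vs. passive vs. unclassifiable) can be sketched as follows. This is only an illustrative sketch and not the pipeline used for the catalog: the function name, the dictionary keys, the order in which the tests are applied, and the reading of "no signs of emission" as "below 2σ in Hα, Hβ and [O III]" are all assumptions. The subsequent separation into H II, Seyfert, LINER and Transition objects via the Kewley et al. (2006) criteria is not reproduced here.

```python
# Lines required for a full emission-line classification (from the text).
ALL_LINES = ("Halpha", "Hbeta", "OIII", "NII", "SII", "OI")
# Lines inspected for the "Passive" class (from the text).
PASSIVE_LINES = ("Halpha", "Hbeta", "OIII")


def coarse_class(flux, flux_err, nsigma=2.0):
    """Return 'emission-line', 'passive' or 'unclassifiable'.

    flux, flux_err: dicts keyed by line name (hypothetical keys above),
    with positive flux errors.
    ASSUMPTION: "no signs of emission" means below nsigma in all three
    PASSIVE_LINES, and the passive test is applied before the
    "unclassifiable" fallback.
    """
    detected = {line: flux[line] >= nsigma * flux_err[line] for line in ALL_LINES}
    if all(detected.values()):
        return "emission-line"
    if not any(detected[line] for line in PASSIVE_LINES):
        return "passive"
    return "unclassifiable"
```

Only galaxies returned as "emission-line" would then be placed on the diagnostic diagrams of Fig. 1.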

2.2. ROSAT All Sky Survey 

Over the course of its eight year mission, the Röntgensatellit
(ROSAT) produced a variety of distinct source catalogs from its
two X-ray detectors, the Position Sensitive Proportional Counter
(PSPC) and High Resolution Imager (HRI). The WGACAT (White et al.
2000) and 2RXP are serendipitous catalogs from pointed ROSAT
observations covering ~15% and ~17% of the sky, respectively. The
High Resolution Imager catalog (1RXH, ROSAT Scientific Team 2000)
covers ~2% of the sky with much greater precision.

The PSPC scanning-mode data are the primary focus of this study:
the RASS Faint Source Catalog (FSC, Voges et al. 2000) and RASS
Bright Source Catalog (BSC, Voges et al. 1999), together covering
92% of the sky. We restrict ourselves to the RASS because we would
eventually like to compute source upper limits. The average
integration time per target in the RASS varies from < 100 seconds
for sources near the equator to > 5000 seconds for sources near
the ecliptic poles, with > 97% receiving more than 100 seconds.^2

Fig. 2. RASS positional errors (rms) in the Faint Source Catalog
(FSC) and Bright Source Catalog (BSC). The numbers after the hash
(#) in this, and all subsequent histograms, give the total number
of points included in that histogram.

The ROSAT PSPC operated between 0.2 and 2.4 keV, with the highest
sensitivity and resolution at roughly 1 keV. The PSPC optics were
focused for 1 keV X-rays, resulting in a 1σ PSF of roughly 25" at
that energy. The resolution is worse for both higher energy (poor
focus) and lower energy (diffraction limit) X-rays. The scan-mode
observations that produced the RASS resulted in an astrometric
positional error (1σ statistical error plus a 6" systematic error)
of 10-20" (Fig. 2). We show in Section 3.3 that the 6" systematic
error is likely overestimated; 3" is likely more correct.

2.3. XMM-Newton Slew Survey 

The RASS catalog is the current best compromise be- 
tween width and depth for X-ray data, but it has limita- 
tions, as noted above. To produce an improved catalog, 
the X-ray Multi Mirror satellite (XMM-Newton) is collecting
X-ray counts during slews between targeted observations.
The first release of the XMM-Newton Slew
Survey (XMM-slew, Freyberg et al. 2005) covers 6240
square degrees of sky, in narrow north-south slews, us- 
ing the EPIC-pn CCD because of its large detector area, 
fast read-out rate and high sensitivity to hard X-rays. 
Although average exposure time is only ~10 s for any
given source, the large mirror area and sensitive detector
make it nearly as deep as the RASS in the soft band
(0.2-2 keV), and deeper and wider than any previous survey in
the hard band (2-12 keV). The quoted 8" positional error
along the slew direction is dominated by the accuracy of 
the attitude reconstruction. The EPIC-pn resolution of 
4" is roughly a factor of 6 better than the RASS resolu- 
tion, thus XMM-slew can resolve many of the confused 
RASS sources. 

Two XMM-slew catalogs were released, a "total" cata- 
log containing all detected sources, and a "clean" catalog 
with known bad sources removed and a higher detection 
threshold. We examine the clean sample in this study; 
it contains 2713 sources with detections in at least one 
band. 

Fig. 3. An example of RASS source confusion (North is up).
SDSS g-band is shown in green (spectroscopic classifications are la- 
beled), RASS pixels in blue (white X marks the source center) and 
FIRST sources in red (both quasars are FIRST sources). Notice 
that the RASS source covers two SDSS spectroscopic galaxies but is 
centered on neither. There are many other cases where there is no 
obvious source for the X-ray emission besides a single SDSS galaxy, 
because of lack of SDSS spectroscopic information about all sources 
in the field. The resolved quasar is SDSS J101643.87+421027.5 for 
reference. 

3. CROSS-MATCHING 

Cross-matching two surveys is simple enough: count 
all objects separated by less than some threshold dis- 
tance (in our case, 60") as possible matches. But the 
validity of such a match depends on the differing sky cov- 
erage, sensitivity, positional accuracy and spatial resolu- 
tion of the two matched surveys. These differences lead 
to matches due to purely random associations, multiple 
cross-matches for single sources, and erroneous flux mea- 
surements due to contributions from multiple sources. 
For example, the ROSAT PSPC is more than an order 
of magnitude worse than the SDSS in both resolution 
and astrometry, and the SDSS source density is much 
higher. Understanding the RASS-SDSS galaxy sample 
is particularly difficult for sources that are not necessar- 
ily expected to be strong X-ray emitters, such as spec- 
troscopically identified low-luminosity narrow-line AGN, 
passive or starburst galaxies. In this section, we attempt 
to quantify the true and random components of RASS- 
SDSS cross-matches. 
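
The basic positional match described at the start of this section can be sketched as below. This is a minimal brute-force illustration, assuming coordinates in decimal degrees; the function names are hypothetical and the text does not say which matching software was actually used. For catalogs the size of SDSS and RASS a spatial index would be needed in place of the full pairwise separation matrix.

```python
import numpy as np


def angular_sep_arcsec(ra1, dec1, ra2, dec2):
    """Great-circle separation (haversine formula); degrees in, arcsec out."""
    ra1, dec1, ra2, dec2 = map(np.radians, (ra1, dec1, ra2, dec2))
    sin_ddec = np.sin((dec2 - dec1) / 2.0)
    sin_dra = np.sin((ra2 - ra1) / 2.0)
    h = sin_ddec**2 + np.cos(dec1) * np.cos(dec2) * sin_dra**2
    return np.degrees(2.0 * np.arcsin(np.sqrt(np.clip(h, 0.0, 1.0)))) * 3600.0


def cross_match(opt_ra, opt_dec, xray_ra, xray_dec, max_sep=60.0):
    """Return (optical index, X-ray index, separation) for every pair
    closer than max_sep arcsec. One optical source may match several
    X-ray sources and vice versa."""
    opt_ra = np.asarray(opt_ra, dtype=float)[:, None]     # shape (N_opt, 1)
    opt_dec = np.asarray(opt_dec, dtype=float)[:, None]
    xray_ra = np.asarray(xray_ra, dtype=float)[None, :]   # shape (1, N_x)
    xray_dec = np.asarray(xray_dec, dtype=float)[None, :]
    sep = angular_sep_arcsec(opt_ra, opt_dec, xray_ra, xray_dec)
    i_opt, i_x = np.nonzero(sep < max_sep)
    return i_opt, i_x, sep[i_opt, i_x]
```

The returned separations are the quantity that is binned to form the source-separation histograms discussed in the following subsections.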

Fig. 3 illustrates an example of the issues faced in matching RASS
and SDSS. Here a RASS source overlaps two spectroscopically
identified SDSS galaxies and is not centered on either of them.
One of the galaxies hosts a quasar and thus is the likely source
of the X-ray flux, while the other is identified as a star-forming
galaxy (H II-type optical spectrum) and thus is expected to
contribute little to the X-ray flux. If the quasar were
unidentified (because it had no spectrum taken), the star-forming
galaxy could have been considered the X-ray source. Another
problem is that the center of the X-ray source does not coincide
with any of the optical sources. This could be simply due to the
astrometric errors in the RASS catalog (Fig. 2), or to
contributions to the total X-ray emission from the other quasar at
the top of the image. This example is not unique: there are many
such confusing matches in the RASS-SDSS galaxy sample because of
the high SDSS source density. Also, this RASS source is relatively
bright, and thus has better centroiding (positional error given as
8") than most RASS sources and was particularly easy to catch.

Fig. 4. Source-separation histogram for SDSS sources that are
expected to have a RASS identification. "Quasar" includes
spectroscopically identified quasars, "blue" includes point
sources with u − g < 0.6, "blue2" is a subset of "blue" restricted
to 15.5 < u < 21 and "random" is a random match between SDSS
galaxies and RASS (see Section 3.2). The lower plot shows the
"blue" sample, with the magnitude cuts included. Note the vastly
different tails for the three quasar-like samples, implying a
significantly different contamination fraction in each.

3.1. Obvious X-ray emitters 

When an SDSS object is the actual source of the RASS X-rays (a
true match), the distance between the X-ray and optical source
positions should be small. Some obvious choices for true matches
are quasars and quasar candidates. To qualitatively assess whether
these "obvious" choices are correct, we plot the distribution of
distances between the centers of the RASS and SDSS sources (the
source-separation histogram) for these particular systems in
Fig. 4. The upper panel includes the following RASS-matched SDSS
sources: spectroscopically identified quasars, sources with
u − g < 0.6 (quasar candidates), a subset of the quasar candidates
restricted to 15.5 < u < 21, and a "random match" between galaxies
and the RASS, as described in the following section. The lower
panel shows the u-magnitude vs. source-separation distribution for
blue sources. Note the clustering of points at small
source-separations for 15.5 < u < 21, suggesting that these are
true RASS-SDSS matches.

From the upper panel, spectroscopically identified
quasars show an obvious peak at small source-separations.
"Blue" objects (all SDSS sources with
u − g < 0.6), which include some objects in the "quasar"
sample, have a peak at small separations as well as a 
prominent tail. The "blue2" sample (subset of "blue" 
with 15.5 < u < 21) has a much smaller tail, suggesting 
a smaller fraction of incorrect matches. Out of these sam- 
ples, spectroscopically identified quasars appear to rep- 
resent the most reliable RASS-SDSS cross-match, with 
the fewest points with large separations. 

3.2. Purely random matches 

Incorrect cross-matches between catalogs are due to random
associations between optical and X-ray sources. Previous work
estimated the random contamination by comparing the source density
of the two catalogs, which works well for samples with a small
random contamination fraction. We model these incorrect matches by
generating "offset" SDSS object catalogs and matching them to the
RASS. We produced 10 such offset catalogs each from the SDSS
galaxy and quasar catalogs by offsetting all objects (either
galaxies or quasars, respectively) from their true RA and Dec by a
fixed amount in a fixed direction, with a different offset and
direction for each offset catalog to reduce systematic effects.
The maximum offset was 1° in RA and Dec. This procedure preserves
the on-sky source distribution of the SDSS, while moving sources
far away from their original RASS associations. When these
catalogs are matched to the RASS, the result is a linearly
increasing source-separation histogram, ∝ r; as the radius
increases, more sources fall within the matching circle. We
compare these random catalogs with our galaxy or quasar RASS
matches to determine the fractional contamination by purely random
associations.

Fig. 5. RASS-SDSS quasar matches, simulated vs. real: simulated
quasar source-separation histogram for cross-matching between RASS
and SDSS quasars. Note the differing tails: the real distribution
does not fall off as quickly, as there is a small fraction of
random matches at large radii. The thin upper curve gives the true
matching fraction at that radius (percent, right axis).
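
The offset-catalog construction, and the reason the resulting histogram grows linearly with radius, can be illustrated with the short sketch below. The function names are hypothetical. The shift is a plain additive offset in RA and Dec as described in the text, with no cos(Dec) correction, and the second function assumes spatially uniform, uncorrelated source densities, which is an idealization used here only for illustration.

```python
import numpy as np


def make_offset_catalog(ra, dec, d_ra, d_dec):
    """Shift every source by the same (d_ra, d_dec), all in degrees.

    RA wraps at 360 deg; Dec is clipped at the poles (a simplification).
    """
    ra = (np.asarray(ra, dtype=float) + d_ra) % 360.0
    dec = np.clip(np.asarray(dec, dtype=float) + d_dec, -90.0, 90.0)
    return ra, dec


def expected_random_counts(bin_edges, n_optical, n_xray, area_arcsec2):
    """Expected number of chance pairs per separation bin.

    ASSUMPTION: both catalogs are uniform and uncorrelated over a common
    area, so the count in a bin scales with the annulus area
    pi * (r_out^2 - r_in^2), i.e. linearly with the bin-center radius for
    equal-width bins.
    """
    edges = np.asarray(bin_edges, dtype=float)
    annulus_area = np.pi * (edges[1:] ** 2 - edges[:-1] ** 2)
    return n_optical * n_xray * annulus_area / area_arcsec2
```

For equal-width bins with edges at 0, 10, 20 and 30 arcsec, the expected chance counts are in the ratio 1:3:5, matching the ratio of the bin centers (5, 15, 25 arcsec).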

3.3. Confirming Quasars 

X-ray source positional measurements have independent, normally
distributed errors in both planar components. This is analogous to
darts thrown at a small target. The precision of each throw is
known, but individual throws may have different precisions. The
distribution of dart-target distances is given by a Rayleigh
distribution having a probability density function (PDF),

$$ f(r; \sigma) = \frac{r}{\sigma^{2}} \exp\left(-\frac{r^{2}}{2\sigma^{2}}\right), $$

with scale parameter σ and separation distance r. In the case of
X-ray measurements, the positional precision, σ, is affected by
the X-ray flux (reliability of centroiding depends on the number
of X-rays) and the pointing accuracy and resolution of the
measuring apparatus. The precision of each RASS source measurement
is listed in the catalog as the positional error (Fig. 2).

We reproduce the source-separation histogram for RASS-SDSS quasar
matches by simulating X-ray source measurements using the
corresponding RASS positional errors plus a small random
component. Because the RASS positional errors are dependent on the
X-ray flux, we use the positional errors from the RASS-SDSS quasar
matched catalog. For each such RASS source, we generate a Rayleigh
distribution with the positional error of that source as the scale
parameter σ. The sum of the probability distribution functions
from each source gives our "simulated true match" curve. This PDF
is the parent distribution for the true matches between RASS and
SDSS quasars. Random associations between RASS and SDSS quasars
have a linearly increasing source-separation histogram, as shown
above. A linear combination of these two distributions (simulation
PDF and random straight line) should reproduce the observed
RASS-SDSS quasar source-separation histogram.

Fig. 6. RASS-SDSS galaxy matches, simulated vs. real:
reconstructed ROSAT/SDSS galaxy source-separation with ~36% true
matches + ~64% random matches out to 60". The thin curve from the
upper left to the lower right, cutting through the histogram, is
the true matching fraction (percent, right axis). There is very
good agreement between the total simulation curve and the actual
distribution.
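
A minimal sketch of the "simulated true match" curve follows: one Rayleigh PDF per matched X-ray source, with that source's catalog positional error as the scale parameter, summed over all sources. The function names are hypothetical, and the sketch omits the "small random component" mentioned above, whose form is not specified in the text.

```python
import numpy as np


def rayleigh_pdf(r, sigma):
    """Rayleigh PDF: (r / sigma^2) * exp(-r^2 / (2 sigma^2))."""
    r = np.asarray(r, dtype=float)
    return (r / sigma**2) * np.exp(-(r**2) / (2.0 * sigma**2))


def simulated_true_match_curve(r, pos_errors):
    """Sum of Rayleigh PDFs, one per matched X-ray source.

    r: separations (arcsec) at which to evaluate the curve
    pos_errors: catalog positional error (arcsec) of each matched source
    """
    r = np.asarray(r, dtype=float)[:, None]               # shape (N_r, 1)
    sigma = np.asarray(pos_errors, dtype=float)[None, :]  # shape (1, N_src)
    return rayleigh_pdf(r, sigma).sum(axis=1)
```

Because each PDF integrates to one, the curve integrates to the number of sources supplied; its overall amplitude is left free when it is combined with the random component.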

We show the quasar source-separation histogram, simulated true
match curve, and random component in Fig. 5. The simulation curve,
which does not include the random component, matches the actual
quasar source-separation histogram very well except at the tail
end. Combining the simulation and random components via a
χ²-minimization on the amplitude of each component yields an
excellent fit. The total fit is not shown in Fig. 5 because it
would be completely masked by the data. This fit has a χ² per
degree of freedom of 0.68. However, the distributions match only
if the RASS positional errors are all reduced by 3", implying that
the quoted 6" systematic offset was overestimated.

The thin upper curve in Fig. 5 gives the "true matching fraction"
for RASS-SDSS quasar matches (percent, right axis). This is the
number of true matches (simulation curve) divided by the total fit
(simulation+random) at that radius. Note that at 30", about 90% of
the RASS-SDSS quasar matches are legitimate. We also find that at
60" there is ~6% total contamination to the RASS-SDSS quasar
catalog. This agrees with the estimate from Anderson et al. (2007)
of ~5% contamination for their sample.
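
The two-component fit and the true matching fraction can be sketched as follows. Since the model is linear in its two amplitudes, the χ² minimum can be found by weighted linear least squares. The function name is hypothetical, and the use of Poisson errors on the histogram counts is an assumption; the text does not state which uncertainties were adopted.

```python
import numpy as np


def fit_true_plus_random(r, observed, sim_curve, errors=None):
    """Fit observed(r) ~ a_true * sim_curve(r) + a_rand * r.

    r must hold bin centers > 0. Returns the two amplitudes, the chi^2
    per degree of freedom, and the true matching fraction per bin.
    """
    r = np.asarray(r, dtype=float)
    observed = np.asarray(observed, dtype=float)
    sim_curve = np.asarray(sim_curve, dtype=float)
    if errors is None:
        # ASSUMPTION: Poisson errors on the bin counts, floored at 1.
        errors = np.sqrt(np.maximum(observed, 1.0))
    errors = np.asarray(errors, dtype=float)
    # Weighted linear least squares is equivalent to chi^2 minimization
    # for a model that is linear in its amplitudes.
    design = np.column_stack([sim_curve, r]) / errors[:, None]
    amps, *_ = np.linalg.lstsq(design, observed / errors, rcond=None)
    true_part = amps[0] * sim_curve
    model = true_part + amps[1] * r
    chi2_dof = np.sum(((observed - model) / errors) ** 2) / (r.size - 2)
    return amps, chi2_dof, true_part / model
```

The third return value corresponds to the thin "true matching fraction" curve: the true-match component divided by the total fit in each radial bin.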

3.4. Galaxies 

Matching SDSS main sample galaxies to the RASS results in 3169
total matches. In contrast to quasars, the RASS-SDSS galaxy
source-separation histogram rises quickly, but is then relatively
flat out to 60", as seen in Fig. 6. This suggests that while some
galaxies are detected as X-ray sources, a large fraction are
simply random associations. We model the RASS-SDSS galaxy
source-separation histogram following the procedure outlined for
quasars above. In this case, the positional errors are those of
the RASS-SDSS galaxy matched catalog. Fig. 6 compares this model
with the actual histogram. Note that the simulated true match
distribution is somewhat wider than the equivalent quasar curve,
as RASS sources associated with galaxies have a lower mean flux
and thus have larger positional errors. The χ² per degree of
freedom of the total fit (simulated+random) is 1.18 for galaxies.

To reduce the effect of source confusion in our RASS-SDSS galaxy
sample, we remove from our matched galaxy catalog RASS sources
that are also positionally matched with likely X-ray emitters. Our
method is similar to that employed by Agüeros et al. (2006), who
removed RASS sources that overlapped with spectroscopically
identified quasars, blue point sources (potential quasars), bright
objects (ROSAT contaminant) and sources with a quasar-like
X-ray/optical spectral slope. Our requirements are more relaxed,
as our aim is not to eliminate all obvious X-ray sources, but
rather to identify X-ray counterparts of galaxies. Thus, we only
remove RASS sources from our matched galaxy catalog that are close
to the most reliable RASS cross-matches: within 40" of an SDSS
quasar or within 30" of an object in the "blue2" list described
above. Also, if two SDSS galaxies match to one RASS source, we
take only the nearest match. This reduces the sample to 1970
galaxies, with many obviously incorrect matches removed, such as
the "match" shown in Fig. 3. This "cleaned" catalog improves the
χ² of the simulation+random fit to 0.96 and is the catalog
employed in the analysis that follows.
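
The cleaning rules above can be sketched as a simple filter over the list of galaxy-RASS matches. The function and argument names are hypothetical, and the sketch assumes that the separation from each RASS source to its nearest SDSS quasar and nearest "blue2" object has already been computed (with infinity where there is none).

```python
import numpy as np


def clean_matches(rass_id, gal_sep, sep_to_quasar, sep_to_blue2,
                  quasar_radius=40.0, blue2_radius=30.0):
    """Return indices of the galaxy-RASS matches kept after cleaning.

    rass_id:       RASS source identifier for each match
    gal_sep:       galaxy-RASS separation of each match (arcsec)
    sep_to_quasar: separation from that RASS source to the nearest SDSS
                   quasar (arcsec, np.inf if none)
    sep_to_blue2:  same, for the nearest "blue2" object
    """
    rass_id = np.asarray(rass_id)
    gal_sep = np.asarray(gal_sep, dtype=float)
    contaminated = (
        (np.asarray(sep_to_quasar, dtype=float) < quasar_radius)
        | (np.asarray(sep_to_blue2, dtype=float) < blue2_radius)
    )
    keep = []
    for rid in np.unique(rass_id[~contaminated]):
        candidates = np.nonzero((rass_id == rid) & ~contaminated)[0]
        # If several galaxies match one RASS source, keep only the nearest.
        keep.append(candidates[np.argmin(gal_sep[candidates])])
    return np.array(sorted(keep), dtype=int)
```

For example, with two galaxies matched to the same RASS source at 20" and 10", only the 10" match is retained, and any match whose RASS source lies within 40" of a quasar is dropped entirely.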

4. RASS DETECTIONS BY GALAXY SPECTROSCOPIC 
CLASS 

One would expect galaxies with different optical spectroscopic
classes to produce different X-ray fluxes and thus to have
different matching fractions. Obrić et al. (2006) list the RASS
matching fractions for SDSS galaxies showing no emission as well
as AGN, star-forming and unknown emission-line galaxies. They also
plot their RASS matches on an emission-line classification diagram
analogous to the left-most plot in Fig. 7. However, they do not
discuss random matches, nor do they remove known invalid matches
(e.g. quasars). Thus, their sample includes many SDSS galaxies
which are unlikely to be true matches to RASS sources. To
investigate the connection between RASS detection likelihood and
optical spectroscopic class, we separate the cleaned RASS-SDSS
galaxy catalog into subclasses as described in Section 2.1.2. For
each of these subclasses, we simulate the source-separation
histogram via their corresponding RASS positional errors and
linear random components as before, and list the χ² of the fits in
Table 1.

Fig. 7 compares the actual and simulated distributions
for the different galaxy classes. The left plot shows the
four different types of classified emission-line galaxies,
while the right plot shows the unclassified and passive 
galaxies. The thin red curves show the true matching 



fraction at a given radius. Note the high true matching 
fraction for galaxies with potentially significant optical 
emission from a central accretion source: the Seyfert, 
LINER and transition objects. Also note the relatively 
high true matching fraction for unclassified emission and 
passive galaxies. Galaxies with their optical emission 
dominated by star formation have a very small true 
matching fraction; though there are a large number of 
RASS-SDSS matches for H II and unclassifiable galaxies, 
most of those matches are purely random associations. 

We list the detection fractions for the various spectral
classes in Table 2, including quasars for comparison. This
detection fraction is the integrated simulation curve
divided by the total number of galaxies in that class. Note
the relatively high detection fraction for galaxies with
AGN-dominated optical emission, including the transition
objects. The large X-ray detection fraction for unclassified
emission sources (defined in Section 2.1.2) suggests
that many of these objects harbor obscured accretion.
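
The detection fraction defined above is a simple ratio, sketched here in Python for concreteness. The function name and the example counts are hypothetical and chosen only to illustrate the arithmetic; they are not values from our catalog.

```python
import numpy as np

def detection_fraction_percent(true_match_counts, n_in_class):
    """Integrated true-match (simulation) curve over the class size, in percent."""
    if n_in_class <= 0:
        raise ValueError("n_in_class must be positive")
    return 100.0 * float(np.sum(true_match_counts)) / n_in_class

# Illustrative numbers only: 66 simulated true matches among 10,000 galaxies.
example = detection_fraction_percent([30, 25, 10, 1], 10_000)
print(f"{example:.2f}%")
```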

The numbers of passive galaxy, unclassifiable galaxy,
and LINER matches to RASS are slightly under-predicted
by the model at moderate radii (20-40").
Visual inspection of these galaxies confirms that some 
of them are in or near clusters, which would produce 
an X-ray source near to, but not coincident with, the 
galaxy. We do not have a cluster catalog to remove 
these "contaminants" but a visual tally shows that be- 
tween half and two-thirds of the RASS-matched pas- 
sive galaxies may be contaminated by the presence of 
a galaxy cluster. However, some of these galaxies appear
to be field galaxies, and thus we may be finding
X-ray Bright, Optically Normal Galaxies (XBONGs; see
Georgantopoulos & Georgakakis 2005). We plan to examine
these objects in more detail in future work.

We list the RASS-SDSS source-separation radii at various
fixed matching fractions in Table 3 for all the objects
discussed in this paper. Notice that at no radius
do H II galaxies show even a 50% true matching fraction.
The true matching fraction for star-forming galaxies
is extremely low because such galaxies do not produce
X-rays at a level detectable by the RASS and/or be- 
cause the X-rays they produce are completely obscured 
by dust. Because nearly all RASS-SDSS star-forming 
galaxy matches are due to random associations, no claims 
can be made about X-ray emitting star-forming galaxies 
from these data alone. The XMM-Slew survey, XMM- 
Newton serendipitous source catalog and the Swift BAT 
catalog all observe at higher X-ray energies (less attenu- 
ated by dust), and so could help clarify the X-ray emis- 
sion properties of these galaxies. 

For comparison with previous studies, we give the cumulative
true matching fraction at fixed radii in Table 4.
These values are computed from the ratio of the integrals
of the simulated and total curves in Fig. 7, in contrast
with the previous table, which is derived from the point-wise
ratios. Again, note that H II galaxies have a very small
cumulative true-match fraction, even at small radii. All 
other matched sub-samples, except for the unclassifiable 
galaxies, contain more than 85% true RASS matches be- 
low 20". 
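
The distinction between the point-wise ratios (Table 3) and the ratios of integrals (Table 4) can be made concrete with a short Python sketch. This is an illustration under assumptions, not the code used for the tables: the function names are hypothetical, the "maximum radius" is taken as the outer edge of the last bin before the point-wise fraction first drops below the threshold, and the counts are invented.

```python
import numpy as np

def pointwise_true_fraction(true_counts, total_counts):
    """Per-bin ratio of simulated true matches to all matches."""
    true_counts = np.asarray(true_counts, dtype=float)
    total_counts = np.asarray(total_counts, dtype=float)
    frac = np.zeros_like(total_counts)
    np.divide(true_counts, total_counts, out=frac, where=total_counts > 0)
    return frac

def cumulative_true_fraction(true_counts, total_counts):
    """Ratio of the integrals (running sums) out to each bin's outer edge."""
    return pointwise_true_fraction(np.cumsum(true_counts), np.cumsum(total_counts))

def max_radius_at_fraction(bin_edges, pointwise, threshold):
    """Largest radius before the point-wise fraction first falls below threshold.

    Returns None when even the innermost bin is below the threshold.
    """
    below = np.flatnonzero(np.asarray(pointwise) < threshold)
    if below.size == 0:
        return float(bin_edges[-1])
    if below[0] == 0:
        return None
    return float(bin_edges[below[0]])

# Invented histogram: four 10-arcsec bins.
bin_edges = np.array([0.0, 10.0, 20.0, 30.0, 40.0])
true_counts = np.array([90.0, 60.0, 20.0, 5.0])
random_counts = np.array([10.0, 20.0, 30.0, 40.0])
total_counts = true_counts + random_counts

pointwise = pointwise_true_fraction(true_counts, total_counts)
cumulative = cumulative_true_fraction(true_counts, total_counts)
print(pointwise, cumulative, max_radius_at_fraction(bin_edges, pointwise, 0.7))
```

Because the cumulative fraction averages over all smaller radii, it declines more slowly with radius than the point-wise fraction, so the two tables are not directly interchangeable.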

5. FUTURE DIRECTIONS: XMM-SLEW 

For comparison with RASS, we have matched the 
XMM-slew clean catalog (first release) to both SDSS 

Fig. 7. ROSAT/SDSS galaxy source separation by galaxy sub-type. Each sub-plot follows the structure of the individual quasar and
galaxy plots shown previously. Shown are the original matched distribution (black, thick, with 1σ Poisson errors) and the simulation
(blue, short-dash) and random (purple, dot) distributions, which sum to the total simulated distribution (green, long-dash). The thin
solid curve (red, right axis) gives the "true matching fraction" described in Section 3.3.



TABLE 1

χ² PER DEGREE OF FREEDOM

      all    passive  unclassifiable  emission  unc. emission  H II   transition  LINER  Seyfert
χ²    1.01   1.24     1.14            0.47      0.90           1.53   1.47        0.47   1.00



TABLE 2

X-RAY DETECTION FRACTIONS (PERCENT)

quasar  all    passive  unclassifiable  emission  unc. emission  H II   transition  LINER  Seyfert
8.3     0.12   0.41     0.11            0.11      0.28           0.004  0.19        0.41   0.66

TABLE 3

Source-separation distance at fixed true matching fraction

fraction  quasar  all  passive  unclassifiable  emission  unc. emission  H II  transition  LINER  Seyfert
85%       33      6    14       ...             11        21             ...   16          17     23
70%       10      15   24       13              25        29             ...   24          25     30
50%       47      23   32       21              32        37             ...   32          32     38

Note. Listed here are the maximum radii (expressed in arcseconds) for a given matching fraction, based on the true match fraction
shown in Fig. 7, with quasars for comparison.



TABLE 4

Cumulative true matching fraction at fixed radius

radius  quasar  all  passive  unclassifiable  emission  unc. emission  H II  transition  LINER  Seyfert
40"     96.9    53   71       48              64        84             7     75          77     88
30"     98.1    64   79       59              74        89             10    83          85     92
20"     99.0    75   86       70              83        94             16    89          91     95

Note. Listed here are the fractions (in percent) of each sample that are true matches, at the given radius, computed by integrating
the curves shown in Fig. 7, with quasars for comparison.



galaxies and quasars. Fig. 8 plots the source-separation
histogram for these sources. The total number of
matches is quite small, due to the small number of XMM-slew
sources and the small overlap area between the surveys.
Because of the nature of the XMM-slew survey,
we cannot perform the same analysis as above; the slew
strips are too narrow for a reliable random fraction to be
determined yet. From the source-separation histogram, 20"
appears to be a reliable cut-off for true matches. Accepting
only those matches within this radius results in 38 galaxy
matches and 115 quasar matches to XMM-slew.

A coverage map for the XMM-slew data is not yet
available, so it is not possible to determine the percentage
of ROSAT detections that are non-detections in XMM-slew.
However, the 38 "reliable" matches include at least one
member of each galaxy class described above. Most of
these XMM-slew detections are in the soft band (0.2-2
keV), but there are a few galaxies with a detected hard
X-ray flux. An example is shown in Fig. 9: an interacting
pair of galaxies optically classified as a Transition
and a Seyfert. The Transition galaxy shows a hard X-ray
flux and a substantial radio point source, while the Seyfert
is undetected in hard and soft X-rays and shows a ~2σ
detection in the FIRST catalog. There is no RASS source
at this location. We plan to follow up on this intriguing
pair to better understand their emission properties and
spectral shape.

Fig. 8. XMM-slew vs. SDSS cross matches for galaxies and
quasars. The ~8" XMM-slew positional errors are readily visible
in the quasar matches.

Fig. 9. An interacting galaxy pair not detected in soft X-rays.
The central-source optical classifications are marked. The
SDSS optically-classified transition galaxy is found in the XMM-slew
survey at only hard (2-12 keV) energies. The stripe on the left
of the GALEX image is due to detector edge effects. The pair is
UGC 08327.

6. CONCLUSIONS 

We have examined the matching statistics between the 
ROSAT All Sky Survey and the SDSS main galaxy sam- 
ple. Our technique — simulating the RASS-SDSS source- 
separation via the RASS positional errors plus a lin- 
ear random component — can reproduce the measured 
source-separations for RASS-SDSS quasar matches as 
well as RASS-SDSS galaxies and subclassifications of 
galaxies. We find that the likelihood of a given cross-match
being a true match depends strongly on the
optical spectral classification of the galaxy. We find
that essentially no optically classified star-forming galaxy
has a true RASS counterpart, while LINERs, Seyfert
2s, and Transition and unclassified emission galaxies do
have reliable X-ray detections. We also find a surprising
number of galaxies lacking optical emission lines which
appear to be detected in the RASS. A complete, low-redshift
SDSS galaxy cluster catalog could be used to
clarify these XBONG candidates.

Our technique can be applied to any cross-matching 
between two surveys. The only requirement is that the 
positional errors of each measurement be known; no ar- 
bitrary fitting parameters are needed. By comparing the 
observed source-separation histogram with a linear com- 
bination of the probability distribution functions com- 
puted from the positional errors and a random matched 
catalog, a "true matching fraction" can be determined 
for any two matched catalogs. This is not limited to 
X-rays: as a test, we were also able to reproduce the 
source-separation histogram for a matched catalog of 
SDSS spectroscopic stars and GALEX UV sources. The 
technique works best for catalogs containing mostly point 
sources, as centroiding extended sources can be difficult 
and the centers of sources may be wavelength-dependent. 
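
To indicate how the approach might carry over to another pair of catalogs, the following Python sketch estimates the random component by re-matching after shifting one catalog in right ascension. This is a hedged illustration only: the offset method is one common way of building a random matched catalog and is assumed here rather than taken from our analysis, the matching uses a brute-force flat-sky approximation suitable only for small catalogs, and all names and synthetic numbers are hypothetical.

```python
import numpy as np

def pair_separations(ra1, dec1, ra2, dec2, max_sep_arcsec):
    """Separations (arcsec) of all pairs closer than max_sep_arcsec.

    Flat-sky, small-angle approximation; ignores RA wrap-around at 0/360 deg.
    """
    cos_dec = np.cos(np.radians(dec1))[:, None]
    dra = (ra1[:, None] - ra2[None, :]) * cos_dec
    ddec = dec1[:, None] - dec2[None, :]
    sep = np.hypot(dra, ddec) * 3600.0
    return sep[sep < max_sep_arcsec]

def true_fraction_from_offset(ra1, dec1, ra2, dec2, bin_edges, offset_deg=0.5):
    """Observed and offset ("random") histograms and the implied true fraction."""
    max_sep = bin_edges[-1]
    observed, _ = np.histogram(
        pair_separations(ra1, dec1, ra2, dec2, max_sep), bins=bin_edges)
    random, _ = np.histogram(
        pair_separations(ra1, dec1, ra2 + offset_deg, dec2, max_sep), bins=bin_edges)
    frac = np.zeros(observed.size)
    np.divide(observed - random, observed, out=frac, where=observed > 0)
    return observed, random, np.clip(frac, 0.0, 1.0)

# Synthetic test: 500 "X-ray" sources, 300 with a real counterpart scattered
# by 10 arcsec per axis, plus 5000 unrelated background objects.
rng = np.random.default_rng(1)
ra_x, dec_x = rng.uniform(100, 110, 500), rng.uniform(-5, 5, 500)
scatter = 10.0 / 3600.0
ra_opt = np.concatenate([ra_x[:300] + rng.normal(0, scatter, 300),
                         rng.uniform(100, 110, 5000)])
dec_opt = np.concatenate([dec_x[:300] + rng.normal(0, scatter, 300),
                          rng.uniform(-5, 5, 5000)])
bin_edges = np.arange(0.0, 65.0, 5.0)
observed, random, frac = true_fraction_from_offset(ra_x, dec_x, ra_opt, dec_opt, bin_edges)
print(observed.sum(), random.sum(), np.round(frac, 2))
```

In practice the random histogram would be combined with the distribution predicted from the positional errors, as described above, rather than used on its own.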
APPENDIX
SDSS GALAXY SELECTION 



The main galaxy sample does not contain all the galaxies with spectra: a galaxy could also have a spectrum taken if
it is within 2" of a FIRST radio source or within the error-circle (10-30") of a RASS source. Luminous red galaxies
are selected for follow-up spectra based on their position in the (g-r, r-i, i) color-color-magnitude cube. Spectra are
also taken for a variety of serendipitous sources including low surface-brightness galaxies. These other sources are
all dimmer than 17.77 in the r-band, and biased toward AGN and star-forming galaxies. The systematics of these
serendipitous sources are poorly understood.

Fig. 10. SDSS galaxies with RASS matches within 1'. Note the sharp drop at 30" within the sample of objects spectroscopically
classified as galaxies compared to the main galaxy sample (all galaxies with r_petro < 17.77). Some galaxies which are not in the main
sample were targeted specifically because they were within 30" of a RASS source.

The primary method for downloading large data sets from SDSS is CasJobs³. To extract the main galaxy sample
from SDSS CasJobs, use the SpecObj parameter ObjType and select those objects classified as "Galaxy". This includes
all objects that were targeted for spectroscopy because they met the main galaxy sample criterion. This classification
is assigned before the spectra are taken, and thus yields a uniform sample. A more naive selection might be to take all objects
spectroscopically classified as galaxies: those with SpecObj parameter SpecClass listed as "Galaxy". However, this
sample includes all objects with a galaxy-like spectrum, which includes objects targeted for the above reasons in
addition to the main galaxy sample.

In Fig. 10 we show the source-separation histogram for these two different samples. The "photometric" sample is
the main galaxy sample used in this study. The spectroscopic sample, with a peak at 30", includes objects specifically
targeted because they were near a RASS source. The fiber-selection process allocates spare spectroscopic fibers to
sources within 30" of a RASS source. These objects, having SpecObj parameter ObjType classifications "ROSAT_A",
"ROSAT_B", "ROSAT_C" or "ROSAT_D", account for roughly 2% of all objects with spectra in SDSS. Stoughton et al.
(2002) claim over half of these ROSAT-based targets turn out to be quasars or AGN. This results in a factor of two
increase in potential matches at matching radii below 30". This is why a statistical analysis of RASS matches to SDSS
must be restricted to the main galaxy sample; the other sources were selected non-uniformly, and though they may yield
odd and interesting spectra, they produce a strong bias in X-ray matching properties.